Plant disease recognition using residual convolutional enlightened Swin transformer networks

Agriculture plays a pivotal role in the economic development of a nation, but its growth is badly affected by many factors, one of which is plant disease. Early-stage prediction of these diseases is crucial for global health and can be a game changer in farmers' lives. Recently, the adoption of modern technologies such as the Internet of Things (IoT) and deep learning has opened the way to intelligent machines that predict plant diseases before they become deep-rooted in farmlands. However, precise prediction of plant diseases is a complex task due to noise, changes in intensity, the close resemblance between healthy and diseased plants, and the dimensions of plant leaves. To tackle this problem, highly accurate and intelligently tuned deep learning algorithms are needed. In this research article, a novel ensemble of Swin transformers and residual convolutional networks is proposed. Swin transformers (ST) are hierarchical structures with linearly scalable computing complexity that offer performance and flexibility at various scales. To extract the best deep key-point features, Swin transformers and residual networks are combined, followed by feed-forward networks for better prediction. Extensive experimentation is conducted using the PlantVillage Kaggle datasets, and performance metrics, including accuracy, precision, recall, specificity, and F1-score, are evaluated and analysed. Existing architectures, including FCN-8s, CED-Net, SegNet, DeepLabv3, Dense nets, and Central nets, are used to demonstrate the superiority of the proposed model. The experimental results show that, in terms of accuracy, precision, recall, and F1-score, the proposed model performs better than the other state-of-the-art hybrid learning models.

To achieve high yield and production, a number of established techniques are applied to help with early disease diagnosis. Environmental health indicators, including pollution, contaminated water, and unhealthy vegetation, are regarded as collateral damage that affects human health [3][4][5]. Deep learning (DL) and machine learning (ML) have become more popular, and efficient techniques are intended to help farmers correctly diagnose plant diseases based on the severity of symptoms. Plant disease diagnosis has changed as a result of advances in DL methods such as convolutional neural networks (CNN) [6], recurrent neural networks (RNN) [7], and deep belief networks (DBN) [8]. DL-based algorithms are able to automatically discover the deeper key features of plants when used to localize the objects of interest [9][10][11].
However, while developing deep learning algorithms that can detect and analyze plant disease effectively, researchers have identified several important issues and hurdles. Some of them are as follows:
1. A high-resolution camera is required to capture images efficiently.
2. Environmental and device noise affects the leaf samples.
3. More training time is needed for multi-class detection across multiple plants.
4. Classifying the severity of symptoms of plant disease remains a real challenge.
5. Achieving the best classification rate for plant diseases is still complex.
In view of the challenges discussed above, this work proposes a novel ensemble of residual convolutional blocks with Swin transformers to provide better detection accuracy under any environmental conditions. The main contributions of this research work are summarized as follows:
1. Develop an intelligent system for expertly identifying diseases in plant leaves using the novel residual Swin transformer networks (RST-Nets), which can be used as an early trigger for plant disease recognition.
2. Create complexity-aware residual networks with transformers to improve the network's ability to focus on both local and global aspects with contextual data, which supports efficient multi-class classification of plant diseases.
3. Conduct extensive experimentation using the PlantVillage datasets and calculate performance metrics.
The results show that the developed model is applicable for overcoming the above-mentioned challenges.
The remainder of the paper is structured as follows. "Associated works" reviews the related work. The dataset description, the proposed technique, and background information on residual networks and Swin transformers are given in Sect. 3. Sect. 4 describes the experiments, their findings, and their assessment. "Conclusion and its future enhancement" concludes the study with a discussion of future directions.

Associated works
Kumar et al. [12] introduced gCrop, an IoT-based leaf growth estimation framework that uses machine learning and computer vision approaches. Low-powered training models are used for platforms with limited resources. The framework first determines the leaf's aspect and then estimates how long the leaves will last. The results show that, depending on the stage of the leaves, the framework can achieve accuracy levels of 98-100%. They also suggest that plant growth improved considerably, with moderate settling. The main limitation of this methodology is that it cannot capture growth over longer time periods because suitable datasets are not readily available [12].
Identifying plant abnormalities in greenhouses or natural environments is the main objective of the investigation by Shima Ramesh Maniyath. The captured image is usually taken against a plain background to eliminate occlusion. The technique was compared with other machine learning models for accuracy. A random forest classifier was used to build the model, which was trained on 160 images of papaya leaves. The model could classify with approximately 70% accuracy. The accuracy can be increased by training with a larger number of images and by using local features together with the global ones, such as SURF (Speeded-Up Robust Features) and DENSE with BoVW (Bag of Visual Words). The main disadvantage is that it can only be used for small datasets and is best suited to controlled harvests [13].
Liu et al. (2021) provide a clear definition of the plant disease and pest detection problem and compare it with traditional methods of plant disease and pest detection. This review examines plant disease and pest detection techniques, including segmentation networks, together with their advantages and disadvantages. Common datasets are introduced, and the results of existing studies are compared. On this basis, the review discusses possible challenges in practical applications of DL-based plant disease and pest identification. In addition, advice on how to resolve these problems is provided, along with research ideas and potential solutions. Finally, the review presents an analysis of, and the outlook for, future trends in deep-learning-based detection of plant diseases and pests [14].
Paramasivam Alagumariappan et al. [15] designed a real-time decision support system connected to a camera sensor module to identify plant diseases. Three ML algorithms, namely the extreme learning machine (ELM) with linear and polynomial kernels and the support vector machine (SVM), were implemented and investigated. According to the findings, the performance of the ELM is superior to that of the widely used SVM classifier. It is also shown that the sensitivity of the SVM with a polynomial kernel is higher than that of the other classifiers. Because the real-time hardware can detect various plant diseases, this work appears to be of high practical relevance. The drawback of this system is that it requires a long training time [15].
In 2020, Ramya et al. introduced a tool to assist farmers in identifying the types of diseases affecting their crops. The image was processed using MATLAB, and the leaf condition was determined with the help of neural network (NN) classification. The environmental conditions were then checked in terms of temperature, moisture, and humidity. After processing the image, the system sends an SMS to the user using global system for mobile (GSM) technology. The SMS contains information on the leaf condition, a specific treatment, and environmental factors. The pump is turned on if the plant condition is abnormal. The proposed framework provides an overview of classification and an AI-based system for detecting plant leaf diseases. NNs form a subclass of ML algorithms built on groups of artificial neurons distributed across at least three layers. The drawback of this system is that it increases device complexity and requires a large amount of memory to handle the plant images [16].
Chowdhury et al. [17] reported the use of 18,161 plain and segmented tomato leaf images with a DL architecture based on a recent CNN called EfficientNet to classify tomato diseases. The U-net and Modified U-net segmentation models were considered for the segmentation of leaves. With the Modified U-net segmentation model, the segmentation of leaf images produced accuracy, IoU, and Dice scores of 98.66%, 98.5%, and 98%, respectively. Using segmented images, EfficientNet-B4 achieved ten-class classification with an accuracy of 99%. All of the architectures were found to perform better at classifying the diseases when trained with deeper networks on segmented images. However, an image can typically contain only one type of lesion, since lesions need to occupy a certain proportion of the image, even though their characteristics are easily extracted [17].
The deep model developed by Khan et al. [18] and deployed on AWS DeepLens makes it possible to predict, in real time, 25 different disease categories in tomato, apple, grape, peach, potato, and strawberry. The accuracy of this model in the real-time environment was 98.78%. Because plant (leaf) diseases can be detected as soon as the first symptoms appear, this pragmatic approach could benefit society, agricultural professionals, and the agri-economic system. The technique is flexible and might be used as a web-based database for organizing and classifying plant leaf disease identification. Computational complexity is also reported to improve with this system. However, if detection accuracy is to be guaranteed, the model needs higher image quality and a heavier computing load, which inevitably results in slow recognition speed and an inability to meet real-time requirements [18].
A system for detecting plant leaf diseases based on deep learning algorithms was developed in 2022 by Varshney et al. A CNN is used as the feature extractor, and an SVM is used for classification. The benchmark PlantVillage dataset was used to evaluate the proposed approach. Accuracy is increased to 88.77% using this framework. The main weakness of this system, however, is its enormous processing burden [19].
Latif et al. (2022) proposed a modified VGG-19-based transfer learning method to accurately recognize and diagnose six classes, including the healthy rice leaf. Using images of leaves, this method can precisely identify five rice diseases. The rice leaf dataset includes both healthy leaves and leaves suffering from the five distinct diseases, including black spot, bacterial leaf blight, leaf blast, and narrow brown spot. With the modified proposed technique, the non-normalized augmented dataset gives the highest average accuracy (96.08%). The corresponding values for precision, recall, specificity, and F1-score were 0.9620, 0.9617, 0.9921, and 0.9616. When combined with IoT technology and mounted on a drone, the system can quickly identify rice disease. The main issue with this system, however, is that performance degrades as the dataset size increases [20].
For the purpose of identifying plant diseases, Gosai et al. applied the ResNet approach in 2022. To address vanishing or exploding gradient issues, the ResNet approach includes a residual block. The ResNet algorithm was tuned with several parameters, including gradient clipping, learning rate scheduling, and weight decay. This model gives better results in correctly diagnosing plant diseases. However, the long training time is this framework's primary drawback [21]. Table 1 provides a quick summary of the literature review.

Dataset description
For training and testing in this study, PlantVillage [22], an open-access repository of plant health images intended to facilitate the development of mobile disease diagnostics, was used. The PlantVillage dataset contains 54,306 images from 14 distinct plants. There are 38 classes in total, of which 26 show distinct plant diseases and 12 show healthy leaves of different plants. Figure 2 shows sample images from the dataset.

Data augmentation process
Data augmentation is the process of improving networks by using additional, transformed data to increase classification accuracy. The augmented dataset is more varied and provides more image data for each plant class. In this study, each plant image category is expanded by the data augmentation method.
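The text does not state which transformations are applied, so the following is only a minimal sketch of how a plant image class could be expanded. The flips, 90-degree rotations, and brightness jitter, as well as the function names `augment` and `expand_class`, are assumptions chosen for illustration and are not taken from the paper.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly transformed copy of a square HxWx3 image with values in [0, 1].

    The specific transformations are illustrative assumptions.
    """
    out = image
    if rng.random() < 0.5:
        out = out[:, ::-1, :]                      # horizontal flip
    if rng.random() < 0.5:
        out = out[::-1, :, :]                      # vertical flip
    k = int(rng.integers(0, 4))
    out = np.rot90(out, k=k, axes=(0, 1))          # rotate by k * 90 degrees
    gain = rng.uniform(0.8, 1.2)                   # mild brightness jitter
    return np.clip(out * gain, 0.0, 1.0)

def expand_class(images, copies, seed=0):
    """Keep the original images and add `copies` augmented variants of each."""
    rng = np.random.default_rng(seed)
    augmented = [augment(img, rng) for img in images for _ in range(copies)]
    return list(images) + augmented
```

With three input images and `copies=2`, `expand_class` returns nine images: the three originals plus six augmented variants.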

Feature extraction
This section details the proposed hybrid model used for the feature extraction process.

Residual Swin transformers
ResNet was used as the basis of the proposed architecture because it provides a solid foundation. After the 16th layer, Swin transformers are embedded inside the 20-layer ResNet architecture. By reducing the overall number of network parameters and simplifying the ResNet, the proposed network is designed to be lighter and more portable.
The first convolution layer has sixteen kernel filters and is followed by batch normalization, a LeakyReLU activation layer, and a median pooling layer. These layers are followed by a Swin transformer, which extracts deeper features from the inputs and feeds them to the residual block. The residual block, which contains the convolutional block, is followed by residual blocks 2 and 3 and then by an average pooling layer that converts the 2D feature maps into 1D features. Figure 3 shows where the Swin transformer is incorporated into the proposed network.
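The residual connection described above can be illustrated with a small NumPy sketch. It is not the authors' implementation: the two-convolution block layout, the per-sample `channel_norm` used as a stand-in for batch normalization, and all function names are assumptions made for illustration.

```python
import numpy as np

def leaky_relu(x, slope=0.01):
    return np.where(x > 0, x, slope * x)

def conv3x3(x, w):
    """Same-padded 3x3 convolution. x: (H, W, Cin), w: (3, 3, Cin, Cout)."""
    h, wd, _ = x.shape
    xp = np.pad(x, ((1, 1), (1, 1), (0, 0)))
    out = np.zeros((h, wd, w.shape[-1]))
    for i in range(3):
        for j in range(3):
            # Each kernel tap contributes a shifted view times a (Cin, Cout) matrix.
            out += xp[i:i + h, j:j + wd, :] @ w[i, j]
    return out

def channel_norm(x, eps=1e-5):
    """Per-channel normalization over spatial positions (assumed stand-in for batch norm)."""
    mean = x.mean(axis=(0, 1), keepdims=True)
    var = x.var(axis=(0, 1), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def residual_block(x, w1, w2):
    """Two conv layers whose output is added back to the input (identity skip)."""
    y = leaky_relu(channel_norm(conv3x3(x, w1)))
    y = channel_norm(conv3x3(y, w2))
    return leaky_relu(x + y)

def global_average_pool(x):
    """Collapse (H, W, C) feature maps into a 1D vector of length C."""
    return x.mean(axis=(0, 1))
```

Because of the skip connection, a block whose convolution weights are all zero passes a non-negative input through unchanged, which is the property that eases gradient flow in deep networks.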
Swin transformers. The Swin transformer architecture is summarized in Fig. 4, which also shows how multi-headed self-attention (MHSA) is used. The input RGB image is first split into distinct, non-overlapping patches by the patch splitting module. Every patch is treated as a token, and each patch's feature is a concatenation of the RGB values of its raw pixels. In this study, the patch size is 3 × 3, so the feature dimension of each patch is 3 × 3 × 3 = 27. A linear embedding layer is used to project this raw-valued feature to an arbitrary dimension. Transformer blocks with MHSA are then applied to extract more features from these patches. Patch merging layers reduce the number of tokens to build a hierarchical representation as the network grows deeper. The first step uses a patch merging layer to concatenate the features of each group of patches, and the second uses the Swin transformer with MHSA to transform the features. To produce a more comprehensive hierarchical feature representation, this procedure is repeated twice. These steps give Stage 1, Stage 2, and Stage 3. To create the lossless features that lead to a better classification mechanism, all of the features are combined. Shifted-window-based MHSA layers are introduced in order to obtain the additional non-overlapping features. Each transformer is made up of two sequential blocks with modified attention layers and shifted window partitions, as shown in Fig. 4.
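The patch splitting, linear embedding, and patch merging steps can be sketched in NumPy as follows. With a 3 × 3 patch on an RGB image, each token carries 3 × 3 × 3 = 27 raw values. The 2 × 2 merging group and the projection from 4C to 2C follow the standard Swin transformer design and are assumptions here, as are the function names and the embedding width of 8.

```python
import numpy as np

def patch_partition(image, patch=3):
    """Split an (H, W, 3) image into non-overlapping patch x patch tokens.

    Returns an array of shape (H/patch, W/patch, patch*patch*3).
    """
    h, w, c = image.shape
    assert h % patch == 0 and w % patch == 0
    x = image.reshape(h // patch, patch, w // patch, patch, c)
    x = x.transpose(0, 2, 1, 3, 4)                 # (rows, cols, patch, patch, c)
    return x.reshape(h // patch, w // patch, patch * patch * c)

def linear_embed(tokens, weight):
    """Project raw patch features to the embedding dimension."""
    return tokens @ weight

def patch_merge(tokens, weight):
    """Concatenate each 2x2 group of neighbouring tokens (4C features) and project."""
    gh, gw, c = tokens.shape
    assert gh % 2 == 0 and gw % 2 == 0
    x = tokens.reshape(gh // 2, 2, gw // 2, 2, c).transpose(0, 2, 1, 3, 4)
    x = x.reshape(gh // 2, gw // 2, 4 * c)
    return x @ weight

rng = np.random.default_rng(0)
image = rng.random((12, 12, 3))
tokens = patch_partition(image)                              # (4, 4, 27)
embedded = linear_embed(tokens, rng.normal(size=(27, 8)))    # (4, 4, 8)
merged = patch_merge(embedded, rng.normal(size=(32, 16)))    # (2, 2, 16)
```

Each merging step reduces the number of tokens by a factor of four while widening the features, which is what produces the hierarchical stages.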

Classification layers
The final layer of the proposed model replaces the dense neural network classifier with the fast extreme learning machine (ELM) proposed by Huang [23]. An ELM is a type of neural network that employs a single hidden layer and works on the principle of auto-tuning. Compared with other learning models such as support vector machines (SVM), the Bayesian classifier (BC), K-nearest neighbours (KNN), or even random forest, the ELM exhibits higher performance, high speed, and minimal computing overhead. The hidden layer of this kind of neural network does not need to be tuned. The ELM makes use of a kernel function to deliver accurate results and improved speed. The main advantages of the ELM are improved approximation and lower training error. The specific working mechanism of the ELM is discussed extensively in [24]. The input feature maps of the ELM (after the capsule network) are represented by an expression in which X denotes the features from the transfer capsule network and P denotes the features from the different types of capsule networks.
The ELM is then characterized by its output function and its general training rule.
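For orientation, the textbook ELM can be sketched as below: the hidden-layer weights are drawn at random and left fixed, and only the output weights are solved in closed form with the Moore-Penrose pseudoinverse. This is the standard formulation and not necessarily the exact variant used in this work; the `tanh` activation, the class name `ELM`, and its parameters are assumptions for illustration.

```python
import numpy as np

class ELM:
    """Single-hidden-layer extreme learning machine for classification (textbook form)."""

    def __init__(self, n_hidden, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Hidden-layer output H = g(XW + b); tanh is an assumed activation.
        return np.tanh(X @ self.W + self.b)

    def fit(self, X, y, n_classes):
        # Random input weights and biases are fixed and never tuned.
        self.W = self.rng.normal(size=(X.shape[1], self.n_hidden))
        self.b = self.rng.normal(size=self.n_hidden)
        T = np.eye(n_classes)[y]                   # one-hot targets
        H = self._hidden(X)
        self.beta = np.linalg.pinv(H) @ T          # least-squares output weights
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

Because training reduces to a single pseudoinverse, no iterative gradient descent is needed, which is the source of the speed advantage usually attributed to ELMs.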

Conclusion and its future enhancement
In this research article, a novel ensemble of Swin transformers and residual networks integrated with feed-forward networks is proposed. In the first stage, Swin and residual networks are used to extract deeper features and achieve better extraction performance, whereas feed-forward networks are adopted in the second stage to achieve the best prediction of multiple plant diseases. Extensive experimentation is carried out using the PlantVillage datasets, and performance metrics are calculated and compared with those of existing hybrid deep learning models. The results show that the proposed architecture outperforms other state-of-the-art solutions, achieving an accuracy of 99.95%, a precision of 99.95%, a recall of 99.96%, a specificity of 99.95%, and an F1-score of 99.95%. Although the proposed model performs better, it is still not suitable for resource-constrained, energy-limited devices due to its computational overhead.
As future work, the computational complexity of the proposed model needs to be reduced so that it can be deployed on IoT edge devices to handle more real-time datasets.

Ethical approval
This article does not contain any studies with human participants or animals performed by the authors. The article uses the benchmark datasets available on Kaggle to evaluate the proposed model.

Figure 1
Figure 1 shows the proposed framework for classifying multiple diseases from multiple plants. The three main parts of the proposed methodology are data augmentation, feature extraction, and classification. The complete pipeline is shown in Fig. 1. The first component is data collection and augmentation. The second component is feature extraction, and the classification stage makes up the final component. The number of plant images available is increased in the data preprocessing step, while the feature extraction method of the proposed model and the dense extreme learning machines of the classification layer are both constructed.
Figure 2. Visual representation of healthy and diseased sample plants: (a) healthy apple, (b) pepper bell bacterial spot, (c) apple black rot and (d) tomato diseases.

Figure 3. Proposed block diagram for the Swin transformer-enabled ResNet topology.

Figures 5 and 6
Figures 5 and 6 present the comparative evaluation of the different algorithms in detecting multiple plant diseases. The advantage of the proposed model is clearly visible, since it produces the best and most uniform performance in classifying multiple diseases from multiple plants. The major advantage of the proposed model is the integration of residual-connected Swin transformers, which enrich the feature extraction process by extracting deeper features that aid better classification of plant diseases. Although the capsule networks and CAPSNET produced average performances of 99% and 98%, respectively, the proposed model edged ahead of these models in the classification of multiple plant diseases. Although the proposed model produced the best performance, its computational overhead may introduce some complexity when deploying it on hardware.

Figure 5. Comparative investigation of the distinct algorithms in detecting healthy plant diseases.
Table 2 contains information about the entire dataset.

Table 1. Quick summary of the literature survey.

Table 2. Plant disease categorization with annotated labels.

Table 3. Performance indicators used for evaluation. TPo and TNe denote true positives and true negatives; FPo and FNe denote false positives and false negatives.
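The body of Table 3 is not reproduced here, so the sketch below uses the standard definitions of the five metrics reported in this paper, written in terms of the TPo, TNe, FPo, and FNe counts. The function name `classification_metrics` is hypothetical, and the sketch assumes all denominators are non-zero.

```python
def classification_metrics(tpo, tne, fpo, fne):
    """Standard binary classification metrics from confusion-matrix counts.

    Assumes every denominator is non-zero.
    """
    accuracy = (tpo + tne) / (tpo + tne + fpo + fne)
    precision = tpo / (tpo + fpo)
    recall = tpo / (tpo + fne)                     # also called sensitivity
    specificity = tne / (tne + fpo)
    f1 = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f1": f1,
    }
```

For a multi-class problem such as the 38 PlantVillage classes, these values would typically be computed per class in a one-vs-rest fashion and then averaged.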

Table 5. Performance evaluation of the developed model in detecting the diseases in apple plant.
99%, recall of 99%, specificity of 99.0% and F1-score of 99.92%, respectively. Moreover, the proposed model shows uniform performance in classifying multiple diseases from multiple plants.

Table 6. Performance evaluation of the developed model in detecting the diseases in strawberry plant.

Table 7. Performance evaluation of the developed model in detecting the diseases in corn plant.

Table 8. Performance evaluation of the developed model in detecting the diseases in squash plant.

Table 9. Performance evaluation of the developed model in detecting the diseases in squash plant.

Table 10. Performance evaluation of the developed model in detecting the diseases in soyabean plant.

Table 11. Performance evaluation of the developed model in detecting the diseases in squash plant.

Table 12. Performance evaluation of the developed model in detecting the diseases in potato plant.

Table 13. Performance evaluation of the developed model in detecting the diseases in tomato plant.